A Competition Strategy to Cost-Sensitive Decision Trees

Authors

  • Fan Min
  • William Zhu
Abstract

Learning from data with test costs and misclassification costs has been an active topic in data mining, and many algorithms have been proposed to induce decision trees for this purpose. This paper studies a number of such algorithms and presents a competition strategy to obtain trees with lower cost. First, we generate a population of decision trees using the λ-ID3 and EG2 algorithms, which consider both information gain and test cost. λ-ID3 is a generalization of three existing algorithms, namely ID3, IDX, and CS-ID3. EG2 is another parameterized algorithm, and its parameter range is extended in this work. Second, we post-prune these trees by considering the tradeoff between the test cost and the misclassification cost. Finally, we select the best decision tree for classification. Experimental results on the mushroom dataset with various cost settings indicate that: 1) no single optimal parameter exists for λ-ID3 or EG2; 2) the competition strategy is effective in selecting an appropriate decision tree; and 3) post-pruning effectively helps decrease the average cost.
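
The generate-then-compete idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the heuristic formulas, so the forms used here are assumptions: information gain multiplied by test cost raised to a non-positive power λ for λ-ID3, and (2^gain − 1) / (cost + 1)^ω for EG2. All function names, attribute names, gains, and costs are hypothetical, and tree construction and post-pruning are omitted.

```python
def lambda_id3_score(gain, test_cost, lam):
    """Assumed lambda-ID3 heuristic: gain * test_cost**lam, with lam <= 0.

    With lam = 0 the test cost is ignored and the score is plain information
    gain. test_cost must be positive when lam is negative.
    """
    if lam > 0:
        raise ValueError("lam is expected to be non-positive")
    return gain * test_cost ** lam


def eg2_score(gain, test_cost, omega):
    """Assumed EG2 heuristic: (2**gain - 1) / (test_cost + 1)**omega."""
    return (2 ** gain - 1) / (test_cost + 1) ** omega


def pick_attribute(gains, costs, score, param):
    """Return the attribute with the highest cost-sensitive score."""
    return max(gains, key=lambda a: score(gains[a], costs[a], param))


def compete(average_costs):
    """Competition step: return the parameter whose tree has the lowest average cost."""
    return min(average_costs, key=average_costs.get)


# Hypothetical information gains and test costs for two attributes.
gains = {"a": 0.9, "b": 0.6}
costs = {"a": 10.0, "b": 2.0}

# Different parameter values can favour different split attributes.
for lam in (0.0, -1.0):
    print(lam, pick_attribute(gains, costs, lambda_id3_score, lam))

# Hypothetical average costs (test cost plus misclassification cost per
# object) of the trees built with each lambda value.
average_costs = {0.0: 4.2, -1.0: 3.5, -2.0: 5.0}
print("winner:", compete(average_costs))
```

In this sketch the expensive attribute "a" wins when λ = 0 and the cheap attribute "b" wins when λ = −1, which shows why a population of trees built with different parameter values can differ, and why selecting the lowest-cost tree afterwards can be useful.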

Similar resources

Cost-sensitive C4.5 with post-pruning and competition

The decision tree is an effective classification approach in data mining and machine learning. In applications, test costs and misclassification costs should be considered while inducing decision trees. Recently, some cost-sensitive learning algorithms based on ID3, such as CS-ID3, IDX, and λ-ID3, have been proposed to deal with this issue. These algorithms handle only symbolic data. In this paper, we ...

Cost-sensitive Decision Trees with Post-pruning and Competition for Numeric Data

The decision tree is an effective classification approach in data mining and machine learning. In some applications, test costs and misclassification costs should be considered while inducing decision trees. Recently, some cost-sensitive learning algorithms based on ID3, such as CS-ID3, IDX, ICET, and λ-ID3, have been proposed to deal with this issue. In this paper, we develop a decision tree algorit...

Applying Stackelberg Game to Find the Best Price and Delivery Time Policies in Competition between Two Supply Chains

In this paper, the competition between two supply chains and their elements is studied. Each chain consists of a manufacturer and a distributor, and the two chains compete in a market with a single type of customer who is sensitive to price and delivery time. Therefore, this is a two-supply-chain game, and during the competition between the two supply chains, elements of each supply chain (manufacturer and/o...

Ensemble Classification and Extended Feature Selection for Credit Card Fraud Detection

Due to the rise of technology, the possibility of fraud in areas such as banking has increased. Credit card fraud is a crucial problem in banking, and its danger is ever increasing. This paper proposes an advanced data mining method that considers both feature selection and decision cost to enhance the accuracy of credit card fraud detection. After selecting the best and most effec...

Constructing Cost Sensitive Decision Trees Based on Multi-Objective Optimization

We propose a method for building cost-sensitive decision trees based on multi-objective optimization. Four optimization goals, namely the misclassification cost, test cost, waiting time cost, and information gain rate, are combined by linear weighting to transform the multi-objective optimization problem into a single-objective optimization problem, as the splitting attribute selection...

Publication date: 2012